AITopics | convolution algorithm

Causes and Effects of Unanticipated Numerical Deviations in Neural Network Inference Frameworks

Neural Information Processing Systems | Feb-16-2026, 13:50:46 GMT

Hardware-specific optimizations in machine learning (ML) frameworks can cause numerical deviations of inference results. Quite surprisingly, despite using a fixed trained model and fixed input data, inference results are not consistent across platforms, and sometimes not even deterministic on the same platform. We study the causes of these numerical deviations for convolutional neural networks (CNN) on realistic end-to-end inference pipelines and in isolated experiments. Results from 75 distinct platforms suggest that the main causes of deviations on CPUs are differences in SIMD use, and the selection of convolution algorithms at runtime on GPUs. We link the causes and propagation effects to properties of the ML model and evaluate potential mitigations. We make our research code publicly available.
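One way to see why SIMD use alone can change a result: floating-point addition is not associative, so a vectorized reduction that accumulates in several lanes can round differently from a scalar loop. The sketch below is a toy illustration in plain Python, not the paper's experimental code; `sequential_sum`, `lane_sum`, and the input values are hypothetical names and data chosen to make the effect visible.

```python
def sequential_sum(values):
    """Accumulate left to right, as a scalar loop would."""
    total = 0.0
    for v in values:
        total += v
    return total


def lane_sum(values, lanes):
    """Accumulate into `lanes` independent partial sums, then combine them.

    This mimics the reordering of additions in a SIMD-style reduction.
    """
    partial = [0.0] * lanes
    for i, v in enumerate(values):
        partial[i % lanes] += v
    return sequential_sum(partial)


# Hypothetical input: the exact sum is 2.0.
values = [1e16, 1.0, -1e16, 1.0]

print(sequential_sum(values))     # 1.0 (the first 1.0 is lost when added to 1e16)
print(lane_sum(values, lanes=2))  # 2.0 (the large terms cancel within one lane)
```

Both orders are valid evaluations of the same mathematical sum; they just round at different points, which is the kind of deviation that can then propagate through later layers.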

artificial intelligence, deviation, machine learning, (20 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Austria > Tyrol > Innsbruck (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Reviews: Designing by Training: Acceleration Neural Network for Fast High-Dimensional Convolution

Neural Information Processing Systems | Oct-7-2024, 09:23:40 GMT

I am happy with the authors' responses to my initial concerns, which have improved the manuscript. I would encourage the authors to add an appendix if they believe it would let them convey a more complete message without having to wait to draft another, longer manuscript. They utilise the new Acceleration Network (AccNet) to express the approximation function of splatting/blurring/slicing (SBS) and generalise SBS by changing the architecture of AccNet. Upon finishing the training process, the proposed fast convolution algorithm can be derived from the weights and activation functions of each layer. They conducted various experiments that prove the effectiveness of the proposed algorithm.

acceleration neural network, algorithm, fast high-dimensional convolution, (9 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.74)

SFC: Achieve Accurate Fast Convolution under Low-precision Arithmetic

He, Liulu, Zhao, Yufei, Gao, Rui, Du, Yuan, Du, Li

arXiv.org Artificial Intelligence | Jul-3-2024

Fast convolution algorithms, including Winograd and FFT, can efficiently accelerate convolution operations in deep models. However, these algorithms depend on high-precision arithmetic to maintain inference accuracy, which conflicts with model quantization. To resolve this conflict and further improve the efficiency of quantized convolution, we propose SFC, a new algebra transform for fast convolution that extends the Discrete Fourier Transform (DFT) with symbolic computing, in which only additions are required to perform the transformation at specific transform points, avoiding the calculation of irrational numbers and reducing the requirement for precision. Additionally, we enhance convolution efficiency by introducing correction terms to convert invalid circular convolution outputs of the Fourier method into effective ones. The numerical error analysis is presented for the first time in this type of work and proves that our algorithms can provide a 3.68x multiplication reduction for 3x3 convolution, while the Winograd algorithm only achieves a 2.25x reduction with similarly low numerical errors. Experiments carried out on benchmarks and FPGA show that our new algorithms can further improve the computation efficiency of quantized models while maintaining accuracy, surpassing both the quantization-alone method and existing works on fast convolution quantization.
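For context on the Winograd baseline, here is a minimal sketch of the standard 1D Winograd F(2,3) scheme, which produces two outputs of a 3-tap filter with 4 multiplications instead of 6. Nesting it in two dimensions gives F(2x2, 3x3) with 16 multiplications instead of 36, a 2.25x reduction, which is presumably the tiling behind the abstract's figure (an assumption; the abstract does not name it). This is not SFC itself, and the function names are hypothetical.

```python
def direct_f23(d, g):
    """Two outputs of a 3-tap filter over 4 inputs: 6 multiplications."""
    y0 = d[0] * g[0] + d[1] * g[1] + d[2] * g[2]
    y1 = d[1] * g[0] + d[2] * g[1] + d[3] * g[2]
    return (y0, y1)


def winograd_f23(d, g):
    """Same two outputs via Winograd F(2,3): 4 multiplications.

    The filter-side terms (including the division by 2) depend only on g,
    so they can be precomputed once per filter and are not counted.
    """
    g_sum = (g[0] + g[1] + g[2]) / 2
    g_alt = (g[0] - g[1] + g[2]) / 2

    m1 = (d[0] - d[2]) * g[0]
    m2 = (d[1] + d[2]) * g_sum
    m3 = (d[2] - d[1]) * g_alt
    m4 = (d[1] - d[3]) * g[2]

    return (m1 + m2 + m3, m2 - m3 - m4)


d = [1.0, 2.0, 3.0, 4.0]  # hypothetical input tile
g = [1.0, 2.0, 3.0]       # hypothetical filter taps

print(direct_f23(d, g))    # (14.0, 20.0)
print(winograd_f23(d, g))  # (14.0, 20.0)
```

The division by 2 in the filter transform is a small instance of the precision issue the abstract raises: larger Winograd tiles need fractional transform constants, which is harder to reconcile with low-bit quantized arithmetic.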

algorithm, convolution, quantization, (15 more...)

arXiv: 2407.02913

Country:

Europe > Austria > Vienna (0.14)
Asia > China > Jiangsu Province > Nanjing (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science > Data Quality > Data Transformation (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

HyPHEN: A Hybrid Packing Method and Optimizations for Homomorphic Encryption-Based Neural Networks

Kim, Donghwan, Park, Jaiyoung, Kim, Jongmin, Kim, Sangpyo, Ahn, Jung Ho

arXiv.org Artificial Intelligence | Dec-8-2023

Convolutional neural network (CNN) inference using fully homomorphic encryption (FHE) is a promising private inference (PI) solution due to the capability of FHE that enables offloading the whole computation process to the server while protecting the privacy of sensitive user data. Prior FHE-based CNN (HCNN) work has demonstrated the feasibility of constructing deep neural network architectures such as ResNet using FHE. Despite these advancements, HCNN still faces significant challenges in practicality due to the high computational and memory overhead. To overcome these limitations, we present HyPHEN, a deep HCNN construction that incorporates novel convolution algorithms (RAConv and CAConv), data packing methods (2D gap packing and PRCR scheme), and optimization techniques tailored to HCNN construction. Such enhancements enable HyPHEN to substantially reduce the memory footprint and the number of expensive homomorphic operations, such as ciphertext rotation and bootstrapping. As a result, HyPHEN brings the latency of HCNN CIFAR-10 inference down to a practical level at 1.4 seconds (ResNet-20) and demonstrates HCNN ImageNet inference for the first time at 14.7 seconds (ResNet-18).

ciphertext, convolution, operation, (16 more...)

doi: 10.1109/ACCESS.2023.3348170

arXiv: 2302.02407

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Maryland > Baltimore (0.14)
North America > Canada > Quebec > Montreal (0.04)
(14 more...)

Genre: Research Report (0.64)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Im2win: Memory Efficient Convolution On SIMD Architectures

Lu, Shuai, Chu, Jun, Liu, Xu T.

arXiv.org Artificial Intelligence | Jun-25-2023

Convolution is the most expensive operation among neural network operations, thus its performance is critical to the overall performance of neural networks. Commonly used convolution approaches, including general matrix multiplication (GEMM)-based convolution and direct convolution, rely on im2col for data transformation or do not use data transformation at all, respectively. However, the im2col data transformation can lead to at least a 2× memory footprint compared to not using data transformation at all, thus limiting the size of neural network models running on memory-limited systems. Meanwhile, not using data transformation usually performs poorly due to nonconsecutive memory access, although it consumes less memory. To solve those problems, we propose a new memory-efficient data transformation algorithm, called im2win. This algorithm refactorizes a row of square or rectangular dot product windows of the input image and flattens unique elements within these windows into a row in the output tensor, which enables consecutive memory access and data reuse, and thus greatly reduces the memory overhead. Furthermore, we propose a high-performance im2win-based convolution algorithm with various optimizations, including vectorization, loop reordering, etc. Our experimental results show that our algorithm reduces the memory overhead to 41.6% on average compared to PyTorch's convolution implementation based on im2col, and achieves average speedups of 3.6× and 5.3× compared to the im2col-based convolution and not using data transformation, respectively.
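To make the im2col memory cost concrete, the following toy sketch unrolls every sliding window of a single-channel image into a row, then computes the convolution as one dot product per row (the GEMM step, reduced here to a single filter). It illustrates the im2col baseline only, not im2win; the helper names, the 4x4 image, and the 3x3 kernel are hypothetical, and stride 1 with no padding is assumed.

```python
def im2col(image, k):
    """Copy each k-by-k window of `image` into its own flattened row."""
    h, w = len(image), len(image[0])
    return [
        [image[r + i][c + j] for i in range(k) for j in range(k)]
        for r in range(h - k + 1)
        for c in range(w - k + 1)
    ]


def conv_via_im2col(image, kernel):
    """Convolution (cross-correlation, as in CNNs) as rows-times-filter."""
    k = len(kernel)
    flat_kernel = [x for row in kernel for x in row]
    out_w = len(image[0]) - k + 1
    values = [
        sum(a * b for a, b in zip(row, flat_kernel))
        for row in im2col(image, k)
    ]
    # Fold the flat result back into a 2D output map.
    return [values[i:i + out_w] for i in range(0, len(values), out_w)]


image = [[r * 4 + c for c in range(4)] for r in range(4)]  # values 0..15
kernel = [[1, 0, -1] for _ in range(3)]

cols = im2col(image, 3)
print(sum(len(row) for row in cols))   # 36 stored elements for a 16-element input
print(conv_via_im2col(image, kernel))  # [[-6, -6], [-6, -6]]
```

Overlapping windows duplicate most input elements, so the unrolled matrix here holds 36 values for a 16-element image. That duplication is the overhead a window-aware layout tries to avoid.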

convolution, data mining, machine learning, (21 more...)

doi: 10.1109/HPEC55821.2022.9926408

arXiv: 2306.1432

Country:

Asia > China > Jiangxi Province > Nanchang (0.05)
North America > United States (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Im2win: An Efficient Convolution Paradigm on GPU

Lu, Shuai, Chu, Jun, Guo, Luanzheng, Liu, Xu T.

arXiv.org Artificial Intelligence | Jun-25-2023

Convolution is the most time-consuming operation in deep neural network operations, so its performance is critical to the overall performance of the neural network. The commonly used methods for convolution on GPU include the general matrix multiplication (GEMM)-based convolution and the direct convolution. GEMM-based convolution relies on the im2col algorithm, which results in a large memory footprint and reduced performance. Direct convolution does not have the large memory footprint problem, but its performance is not on par with the GEMM-based approach because of discontinuous memory access. This paper proposes a window-order-based convolution paradigm on GPU, called im2win, which not only reduces memory footprint but also offers continuous memory accesses, resulting in improved performance. Furthermore, we apply a range of optimization techniques on the convolution CUDA kernel, including shared memory, tiling, micro-kernel, double buffer, and prefetching. We compare our implementation with the direct convolution, PyTorch's GEMM-based convolution with cuBLAS, and six cuDNN-based convolution implementations, on twelve state-of-the-art DNN benchmarks. The experimental results show that our implementation 1) reduces memory footprint by 23.1% and achieves 3.5× TFLOPS compared with cuBLAS, 2) reduces memory footprint by 32.8% and achieves up to 1.8× TFLOPS compared with the best-performing convolutions in cuDNN, and 3) achieves up to 155× TFLOPS compared with the direct convolution. We further perform an ablation study on the applied optimization techniques and find that the micro-kernel has the greatest positive impact on performance.

artificial intelligence, convolution, machine learning, (18 more...)

arXiv: 2306.14316

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > California > Merced County > Merced (0.14)
Asia > China > Jiangxi Province > Nanchang (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Hyperspectral Remote Sensing Image Classification Based on Multi-scale Cross Graphic Convolution

Zhao, Yunsong, Li, Yin, Chen, Zhihan, Qiu, Tianchong, Liu, Guojin

arXiv.org Artificial Intelligence | Jun-28-2021

The mining and utilization of features directly affect the classification performance of models used in the classification and recognition of hyperspectral remote sensing images. Traditional models usually conduct feature mining from a single perspective, with the features mined being limited and the internal relationships between them being ignored. Consequently, useful features are lost and classification results are unsatisfactory. To fully mine and utilize image features, a new multi-scale feature-mining learning algorithm (MGRNet) is proposed. The model uses principal component analysis to reduce the dimensionality of the original hyperspectral image (HSI) to retain 99.99% of its semantic information and extract dimensionality reduction features. Using a multi-scale convolution algorithm, the input dimensionality reduction features were mined to obtain shallow features, which then served as inputs into a multi-scale graph convolution algorithm to construct the internal relationships between eigenvalues at different scales. We then carried out cross fusion of multi-scale information obtained by graph convolution, before inputting the new information obtained into the residual network algorithm for deep feature mining. Finally, a flexible maximum transfer function classifier was used to predict the final features and complete the classification. Experiments on three common hyperspectral datasets showed the MGRNet algorithm proposed in this paper to be superior to traditional methods in recognition accuracy.
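The "flexible maximum transfer function classifier" in the final step most likely refers to a softmax classifier; that reading is an assumption, since the abstract does not use the term. Under that assumption, the last stage can be sketched as turning per-class scores into probabilities and picking the largest. The scores below are hypothetical and `softmax` is an illustrative helper, not code from the paper.

```python
import math


def softmax(logits):
    """Map raw class scores to probabilities that sum to 1.

    Subtracting the maximum first keeps exp() from overflowing
    without changing the result.
    """
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three land-cover classes at one pixel.
scores = [2.0, 1.0, 0.1]
probs = softmax(scores)

# The predicted class is the index with the highest probability.
predicted = max(range(len(probs)), key=probs.__getitem__)
print(predicted)  # 0
```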

algorithm, convolution, dataset, (13 more...)

arXiv: 2106.14804

Country:

Asia > China > Chongqing Province > Chongqing (0.05)
Asia > China > Sichuan Province > Chengdu (0.04)

Genre: Research Report (0.50)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.65)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)

cuConv: A CUDA Implementation of Convolution for CNN Inference

Jordà, Marc, Valero-Lara, Pedro, Peña, Antonio J.

arXiv.org Artificial Intelligence | Mar-30-2021

Convolutions are the core operation of deep learning applications based on Convolutional Neural Networks (CNNs). Current GPU architectures are highly efficient for training and deploying deep CNNs, and hence, these are largely used in production for this purpose. State-of-the-art implementations, however, present a lack of efficiency for some commonly used network configurations. In this paper we propose a GPU-based implementation of the convolution operation for CNN inference that favors coalesced accesses, without requiring prior data transformations. Our experiments demonstrate that our proposal yields notable performance improvements in a range of common CNN forward propagation convolution configurations, with speedups of up to 2.29x with respect to the best implementation of convolution in cuDNN, hence covering a relevant region in currently existing approaches.

artificial intelligence, configuration, machine learning, (18 more...)

doi: 10.1007/s10586-021-03494-y

arXiv: 2103.16234

Country:

Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Europe > France (0.04)

Genre: Research Report (0.65)

Industry: Information Technology (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

CVML Live Web-Lecture Series – Icarus

#artificialintelligence | Dec-12-2019, 09:17:21 GMT

CVML Live Web Lecture Series Concept: The Artificial Intelligence and Information Analysis (AIIA) Lab, AUTH, is proud to launch the live CVML Web lecture series, which will cover very important topics in computer vision and machine learning. Top scientists internationally will deliver these lectures, aiming to provide in-depth knowledge on various CVML topics. The 1-hour lectures will take place on Saturdays, to avoid conflicts with registrants' other schedules and duties: a) Saturdays 11:00 EET (17:00 Beijing time) and b) Saturdays 20:00 EET (13:00 EST, 10:00 PST for NY/LA, respectively) for audiences in the Americas. Each lecture will be announced at least 1 week in advance on various relevant email lists and on this page. Lectures will consist primarily of live lecture streaming and PPT slides.

algorithm, convolution, convolution algorithm, (14 more...)

Country:

Asia > China > Beijing > Beijing (0.26)
Europe > Greece > Central Macedonia > Thessaloniki (0.06)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.05)
Asia > India (0.05)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry: Education > Educational Setting (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.33)
